Abstract. We describe a new algorithm for computing $\exp f$ where $f$ is a
power series in $\mathbf{C}[[x]]$. If $M(n)$ denotes the cost of multiplying
polynomials of degree $n$, the new algorithm costs $(2.1666\ldots + o(1))M(n)$
to compute $\exp f$ to order $n$. This improves on the previous best result,
namely $(2.333\ldots + o(1))M(n)$.



The author recently gave new algorithms for computing the square root and
reciprocal of power series in $\mathbf{C}[[x]]$, achieving better running time
constants than those previously known [Har09]. In this paper we apply similar
techniques to the problem of computing $\exp f$ for a power series
$f \in \mathbf{C}[[x]]$. Previously, the best known algorithm was that of
van der Hoeven [vdH06, p. 6], computing $g = \exp(f) \bmod x^n$ in time
$(7/3 + o(1))M(n)$, where $M(n)$ denotes the cost of multiplying polynomials
of degree $n$. We give a new algorithm that performs the same task in time
$(13/6 + o(1))M(n)$.

Van der Hoeven's algorithm works by decomposing $f$ into blocks, and solving
$g' = f'g$ by operating systematically with FFTs of blocks. Our starting point
is the observation that his algorithm computes too much, in the sense that at
the end of the algorithm, the FFT of every block of $g$ is known. Our new
algorithm uses van der Hoeven's algorithm to compute the first half of $g$, and
then extends the approximation to the target precision using a Newton iteration
due to Brent [Bre76] (see also [HZ04] or [Ber04] for other exponential
algorithms based on a similar iteration). At the end of the algorithm, only the
FFTs of the blocks of the first half of $g$ are known. In fact, the reduction in
running time relative to van der Hoeven's algorithm turns out to be equal to the
cost of these 'missing' FFTs.

We freely use notation and complexity assumptions introduced in [Har09].
Briefly: 'running time' always means number of ring operations in $\mathbf{C}$.
The Fourier transform of length $n$ is denoted by $\mathcal{F}_n(g)$, and its
cost by $T(n)$. We assume that $T(2n) = (1/3 + o(1))M(n)$ for a sufficiently
dense set of integers $n$. For Proposition 1 below, we fix a block size $m$, and
for any $f \in \mathbf{C}[[x]]$ we write
$f = f_{[0]} + f_{[1]}X + f_{[2]}X^2 + \cdots$ where $X = x^m$ and
$\deg f_{[i]} < m$. The key technical tool is [Har09, Lemma 1], which asserts
that if $f, g \in \mathbf{C}[[x]]$, $k \geq 0$, and if
$\mathcal{F}_{2m}(f_{[i]})$ and $\mathcal{F}_{2m}(g_{[i]})$ are known for
$0 \leq i \leq k$, then $(fg)_{[k]}$ may be computed in time
$T(2m) + O(m(k + 1))$.
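
The block notation can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: it uses schoolbook multiplication instead of Fourier transforms, and the helper names `blocks`, `mul` and `product_block` are hypothetical, not taken from [Har09]. What it does share with the lemma is the observation that only the block products $f_{[i]}g_{[j]}$ with $i + j \in \{k-1, k\}$ contribute to $(fg)_{[k]}$.

```python
def blocks(f, m):
    """Split the coefficient list of f into blocks f_[0], f_[1], ... of length m."""
    f = list(f) + [0] * (-len(f) % m)   # pad so the length is a multiple of m
    return [f[i:i + m] for i in range(0, len(f), m)]

def mul(a, b):
    """Schoolbook product of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def product_block(fb, gb, k, m):
    """Return (fg)_[k] from the blocks fb = f_[0..k] and gb = g_[0..k]."""
    # window holds the coefficients of x^{(k-1)m} .. x^{(k+1)m - 1}
    window = [0] * (2 * m)
    for i in range(k + 1):
        for j in range(k + 1):
            if i + j in (k - 1, k):
                shift = (i + j - (k - 1)) * m
                for t, c in enumerate(mul(fb[i], gb[j])):
                    if shift + t < 2 * m:
                        window[shift + t] += c
    return window[m:]   # the upper half of the window is block k

f = [1, 2, 3, 4, 5, 6]
g = [6, 5, 4, 3, 2, 1]
m = 2
full = mul(f, g)
fb, gb = blocks(f, m), blocks(g, m)
```

In the lemma itself the two sums over $i + j = k - 1$ and $i + j = k$ are accumulated in the Fourier domain, which is why a single transform of length $2m$ suffices; the sketch only checks the bookkeeping of which blocks are involved.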

We define a differential operator $\delta$ by $\delta f = x f'(x)$, and we set
$\delta_k f = X^{-k}\delta(X^k f)$. In particular
$\delta(f_{[0]} + f_{[1]}X + \cdots) = (\delta_0 f_{[0]}) + (\delta_1 f_{[1]})X + \cdots$.
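
On coefficient lists these operators are simple scalings, which the following sketch illustrates. The function names `delta`, `delta_k` and `delta_k_inv` are hypothetical, and exact rationals are used for the inverse; the inverse is written under the assumption $k \geq 1$, so that no division by zero occurs.

```python
from fractions import Fraction

def delta(f):
    """delta f = x f'(x): the coefficient of x^j is multiplied by j."""
    return [j * c for j, c in enumerate(f)]

def delta_k(h, k, m):
    """delta_k h = X^{-k} delta(X^k h), X = x^m: coefficient j is scaled by k*m + j."""
    return [(k * m + j) * c for j, c in enumerate(h)]

def delta_k_inv(h, k, m):
    """Inverse of delta_k on a block; assumes k >= 1 so that k*m + j is never zero."""
    return [Fraction(c) / (k * m + j) for j, c in enumerate(h)]

m = 3
f = [5, 1, 4, 2, 7, 3, 9, 8, 6]
# Blocks of delta(f) versus delta_k applied to each block of f.
blocks_of_delta_f = [delta(f)[i:i + m] for i in range(0, len(f), m)]
delta_k_of_blocks = [delta_k(f[k * m:(k + 1) * m], k, m) for k in range(len(f) // m)]
```

The two lists agree, which is the blockwise identity stated above; both operators cost $O(m)$ ring operations per block.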
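
Algorithm 1: Exponential

Input:  $s \in \mathbf{Z}$, $s \geq 1$
        $f \in \mathbf{C}[[x]]$, $f = 0 \bmod x$
        $g_{[0]} = \exp(f_{[0]}) \bmod X$
        $u = \exp(-f_{[0]}) \bmod X$ $(= g_{[0]}^{-1} \bmod X)$
Output: $g = g_{[0]} + \cdots + g_{[2s-1]}X^{2s-1} = \exp(f) \bmod X^{2s}$

 1  Compute $\mathcal{F}_{2m}(g_{[0]})$, $\mathcal{F}_{2m}(u)$
 2  for $0 \leq k < s$ do compute $\mathcal{F}_{2m}((\delta f)_{[k]})$
 3  for $1 \leq k < s$ do
 4      $\psi \leftarrow ((g_{[0]} + \cdots + g_{[k-1]}X^{k-1})((\delta f)_{[0]} + \cdots + (\delta f)_{[k]}X^k))_{[k]}$
 5      Compute $\mathcal{F}_{2m}(\psi)$
 6      $\varphi \leftarrow u\psi \bmod X$
 7      Compute $\mathcal{F}_{2m}(\delta_k^{-1}\varphi)$
 8      $g_{[k]} \leftarrow g_{[0]}(\delta_k^{-1}\varphi) \bmod X$
 9      Compute $\mathcal{F}_{2m}(g_{[k]})$
10  for $0 \leq k < s$ do $q_{[k]} \leftarrow (\delta f)_{[k]}$
11  for $s \leq k < 2s$ do
12      $\psi \leftarrow ((q_{[0]} + \cdots + q_{[k-1]}X^{k-1})(g_{[0]} + \cdots + g_{[s-1]}X^{s-1}))_{[k]}$
13      Compute $\mathcal{F}_{2m}(\psi)$
14      $q_{[k]} \leftarrow -u\psi \bmod X$
15      Compute $\mathcal{F}_{2m}(q_{[k]})$
16  for $0 \leq k < s$ do $\varepsilon_{[k]} \leftarrow \delta_{k+s}^{-1} q_{[k+s]} - f_{[k+s]}$
17  for $0 \leq k < s$ do compute $\mathcal{F}_{2m}(\varepsilon_{[k]})$
18  for $0 \leq k < s$ do $g_{[k+s]} \leftarrow -((g_{[0]} + \cdots + g_{[k]}X^k)(\varepsilon_{[0]} + \cdots + \varepsilon_{[k]}X^k))_{[k]}$
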
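
To make the data flow of Algorithm 1 easier to follow, here is a reference sketch in Python. It is not the algorithm as analysed: every Fourier transform is omitted and all products are schoolbook products over exact rationals, so it has none of the claimed running time. The names `algorithm1`, `exp_naive`, `block`, `flat`, `mul` and `mulmod` are hypothetical helpers introduced for illustration, and the inputs $g_{[0]}$ and $u$ are obtained here from a naive quadratic recurrence, which is an assumption of the sketch rather than part of the algorithm.

```python
from fractions import Fraction

def mul(a, b):
    """Schoolbook product of two coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def mulmod(a, b, m):
    """Product modulo X = x^m."""
    return mul(a, b)[:m]

def exp_naive(f, n):
    """exp(f) mod x^n via i*g_i = sum_j j*f_j*g_{i-j}; assumes f[0] == 0."""
    g = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for i in range(1, n):
        terms = (j * f[j] * g[i - j] for j in range(1, i + 1) if j < len(f))
        g[i] = sum(terms, Fraction(0)) / i
    return g

def block(p, k, m):
    """Block k of p: the coefficients of x^{km} .. x^{km+m-1}, zero-padded."""
    return [p[i] if i < len(p) else Fraction(0) for i in range(k * m, (k + 1) * m)]

def flat(bs):
    """Concatenate a list of blocks back into one coefficient list."""
    return [c for b in bs for c in b]

def algorithm1(f, m, s):
    """Return exp(f) mod x^{2sm}, following lines 1-18 without the transforms."""
    f = [Fraction(c) for c in f] + [Fraction(0)] * (2 * s * m - len(f))
    fb = [block(f, k, m) for k in range(2 * s)]
    dk = lambda h, k: [(k * m + j) * c for j, c in enumerate(h)]      # delta_k
    dk_inv = lambda h, k: [c / (k * m + j) for j, c in enumerate(h)]  # k >= 1 only
    df = [dk(fb[k], k) for k in range(2 * s)]      # (delta f)_[k] = delta_k f_[k]
    g = [exp_naive(fb[0], m)]                      # input g_[0]
    u = exp_naive([-c for c in fb[0]], m)          # input u = 1/g_[0] mod X
    for k in range(1, s):                          # lines 3-9
        psi = block(mul(flat(g), flat(df[:k + 1])), k, m)
        phi = mulmod(u, psi, m)
        g.append(mulmod(g[0], dk_inv(phi, k), m))
    q = df[:s]                                     # line 10
    g0 = flat(g)
    for k in range(s, 2 * s):                      # lines 11-15
        psi = block(mul(flat(q), g0), k, m)
        q.append([-c for c in mulmod(u, psi, m)])
    eps = [[a - b for a, b in zip(dk_inv(q[k + s], k + s), fb[k + s])]
           for k in range(s)]                      # line 16
    for k in range(s):                             # line 18
        prod = mul(flat(g[:k + 1]), flat(eps[:k + 1]))
        g.append([-c for c in block(prod, k, m)])
    return flat(g)

f = [0, 1, 2, -3, 5, 0, 7, -1]
g = algorithm1(f, m=2, s=2)
```

A natural sanity check is to compare `g` with `exp_naive(f, 8)`, which computes the same truncated exponential directly from the differential equation $g' = f'g$.
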
Proposition 1. Algorithm 1 is correct, and runs in time
$(13s - 4)T(2m) + O(s^2 m)$.

Proof. We first show that the loop in lines 3–9 (essentially van der Hoeven's
exponential algorithm) correctly computes

$$g_0 = g \bmod X^s = g_{[0]} + \cdots + g_{[s-1]}X^{s-1} = \exp(f) \bmod X^s.$$

By definition $g_{[0]}$ is correct. In the $k$th iteration, assume that
$g_{[1]}, \ldots, g_{[k-1]}$ have been computed correctly. Since
$\delta g_0 = g_0(\delta f) \bmod X^s$ we have

$$((g_{[0]} + \cdots + g_{[k]}X^k)((\delta f)_{[0]} + \cdots + (\delta f)_{[k]}X^k))_{[k]} = (\delta g)_{[k]},$$

and by construction

$$((g_{[0]} + \cdots + g_{[k-1]}X^{k-1})((\delta f)_{[0]} + \cdots + (\delta f)_{[k]}X^k))_{[k]} = \psi.$$

Subtracting yields

$$(\delta g)_{[k]} - \psi = g_{[k]}(\delta f)_{[0]} \bmod X,$$

and on multiplying by $u$ we obtain

$$\varphi = u\psi \bmod X = (\delta g)_{[k]}u - g_{[k]}(\delta f)_{[0]}u \bmod X$$
$$= (\delta g)_{[k]}u + g_{[k]}(\delta u) \bmod X$$
$$= \delta_k(g_{[k]}u \bmod X)$$

since $\delta u = -(\delta f)u \bmod X$. Therefore $g_{[k]}$ is computed
correctly in line 8.
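
The last equality in this chain uses a product rule for $\delta_k$ that the text leaves implicit. A short derivation from the definitions above might read as follows; the auxiliary symbol $h$ is introduced here only for illustration.

```latex
% For a block h with \deg h < m, and X = x^m:
\delta_k h = X^{-k}\,\delta(X^k h) = km\,h + \delta h .
% Hence (\delta g)_{[k]} = \delta_k g_{[k]}, and working modulo X:
(\delta g)_{[k]}\,u + g_{[k]}\,(\delta u)
  = km\,g_{[k]}u + (\delta g_{[k]})\,u + g_{[k]}\,(\delta u)
  = km\,g_{[k]}u + \delta(g_{[k]}u)
  = \delta_k(g_{[k]}u) .
```

Since $\delta$ acts coefficientwise, it commutes with reduction modulo $X$, which is why the final expression can be written as $\delta_k(g_{[k]}u \bmod X)$. For $k \geq 1$ the operator $\delta_k$ multiplies the coefficient of $x^j$ by $km + j \neq 0$, so it is invertible on blocks, as line 7 requires.
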

Next we show that lines 10–15 correctly compute

$$q = q_{[0]} + \cdots + q_{[2s-1]}X^{2s-1} = \frac{\delta g_0}{g_0} \bmod X^{2s}.$$

Since $\delta g_0 / g_0 \bmod X^s = \delta f \bmod X^s$, line 10 correctly
computes $q_{[0]}, \ldots, q_{[s-1]}$. The loop in lines 11–15 computes
$q_{[s]}, \ldots, q_{[2s-1]}$ using a similar strategy to the division
algorithm in [vdH06, p. 6]. Namely, in the $k$th iteration, assume that
$q_{[0]}, \ldots, q_{[k-1]}$ are correct. Then

$$((q_{[0]} + \cdots + q_{[k-1]}X^{k-1})(g_{[0]} + \cdots + g_{[s-1]}X^{s-1}))_{[k]} = \psi$$

and

$$((q_{[0]} + \cdots + q_{[k]}X^k)(g_{[0]} + \cdots + g_{[s-1]}X^{s-1}))_{[k]} = (q g_0)_{[k]} = (\delta g_0)_{[k]} = 0$$

since $\deg(\delta g_0) < sm$ and $k \geq s$. Subtracting, we obtain
$g_{[0]}q_{[k]} = -\psi \bmod X$, so $q_{[k]}$ is computed correctly in
line 14. (Note that the transforms of $q_{[0]}, \ldots, q_{[s-1]}$ used in
line 12 are already known, since they were computed in line 2.)

At this stage we have

$$\frac{\delta g_0}{g_0} \bmod X^{2s} = q = \delta f + \delta(\varepsilon X^s)$$

for some $\varepsilon = \varepsilon_{[0]} + \cdots + \varepsilon_{[s-1]}X^{s-1}$.
Line 16 computes the blocks of $\varepsilon$. Then by logarithmic integration,
we have

$$g_0 = \exp(f)\exp(\varepsilon X^s) \bmod X^{2s},$$

so

$$\exp(f) \bmod X^{2s} = g_0 \exp(-\varepsilon X^s) \bmod X^{2s} = g_0(1 - \varepsilon X^s) \bmod X^{2s}.$$

Line 18 multiplies out the latter product to compute the remaining blocks
of $g$.

We now analyse the complexity. Each iteration of lines 4, 12 and 18 costs
$T(2m) + O(m(k + 1))$ according to [Har09, Lemma 1]; their total contribution
is therefore $(3s - 1)T(2m) + O(s^2 m)$. Lines 6, 8 and 14 each require a
single inverse transform, contributing a total of $(3s - 2)T(2m)$. The
explicitly stated forward transforms contribute $(7s - 1)T(2m)$. The various
other operations, including applications of $\delta_k$ and $\delta_k^{-1}$,
contribute only $O(sm)$. The total is $(13s - 4)T(2m) + O(s^2 m)$. □
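
The three transform counts can be tallied mechanically from the loop bounds of Algorithm 1. The sketch below does only that arithmetic; the function name `transform_count` is hypothetical, and the grouping of lines follows the paragraph above.

```python
def transform_count(s):
    """Count transforms of length 2m in Algorithm 1, grouped as in the analysis."""
    # Lines 4, 12, 18: one T(2m) per iteration (s-1, s and s iterations).
    products = (s - 1) + s + s
    # Lines 6 and 8 run s-1 times each, line 14 runs s times: one inverse each.
    inverses = 2 * (s - 1) + s
    # Forward transforms: line 1 (two), line 2 (s), lines 5/7/9 (3 per iteration,
    # s-1 iterations), lines 13/15 (2 per iteration, s iterations), line 17 (s).
    forwards = 2 + s + 3 * (s - 1) + 2 * s + s
    return products, inverses, forwards

counts = {s: transform_count(s) for s in range(1, 6)}
```

For each $s$ the three entries are $3s - 1$, $3s - 2$ and $7s - 1$, and they sum to $13s - 4$.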

Theorem 2. Let $f \in \mathbf{C}[[x]]$ with $f = 0 \bmod x$. Then $\exp f$ may
be computed to order $n$ in time $(13/6 + o(1))M(n)$.

Proof. Apply the proof of [Har09, Theorem 3] to Proposition 1 with $r = 2s$. □
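
The constant $13/6$ can be recovered informally from Proposition 1. The following back-of-envelope computation is only a heuristic sketch: it takes $n = 2sm$, assumes $M$ is close enough to linear that $2s\,M(m) \approx M(2sm)$, and assumes $s$ grows slowly enough that the $O(s^2 m)$ term is negligible. The rigorous argument is the one in [Har09].

```latex
(13s - 4)\,T(2m) \approx 13s \cdot \tfrac{1}{3} M(m)
  = \tfrac{13}{6} \cdot 2s\,M(m)
  \approx \tfrac{13}{6}\,M(n), \qquad n = 2sm .
% Comparison with van der Hoeven's bound 7/3 = 14/6:
\left(\tfrac{7}{3} - \tfrac{13}{6}\right) M(n) = \tfrac{1}{6} M(n)
  \approx \tfrac{1}{6} \cdot 2s \cdot 3\,T(2m) = s\,T(2m) .
```

Under the same assumptions, the saving of $s\,T(2m)$ matches the cost of one forward transform for each of the $s$ blocks in the second half of $g$, whose transforms are never computed.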